ImproveOSM: Using Big Data to Improve the Map by Brian Lau Live captioning by Norma Miller. @whitecoatcapxg >> To be honest, this is my first time at State of the Map, and I guess I'm new to OpenStreetMap. Funny story: during my interview, I was asked, "What's OSM?" and the best answer I could come up with was, "It's a Wikipedia version for maps?" Hopefully I got the answer right, and that's why I'm here today. So today I'm going to talk about ImproveOSM, but we also have another talk tomorrow, because we have two exciting things that we're doing, and in today's session and tomorrow's we will go into a little bit of detail. Hopefully you will all become familiar with it and have time to use it. Just a little bit about Telenav: since 2014, we've done thousands of tests, and from that we've been able to add thousands of roads and do corrections and normalizations. So here again, we've added millions of missing roads, just from Telenav's involvement with OSM. We've added thousands of turn restrictions, and we were able to contact local communities and local cities to collect road data from them, from public sources. I've personally contacted a dozen of them, and they're all actually very nice. If you go to a local city's website, they usually have a GIS link, and if you want to ask for their permission, assuming they have public data such as new roads being added in neighborhoods, they're usually very quick to reply, and a hundred percent of the time they allow you to take the data and add it to OSM. It's adding to the community and it's making the map better. We have also done boundary reports in Mexico through use of INEGI. We have plans to move forward with that as we progress.
So at Telenav, we're always looking for new ways to add as much data to OSM as we can, and what we've found is that we have a wealth of probe data through our Scout app and also through other sources, where we're able to take the probe data and find the road that was traveled and the direction it's going, as well. So the best way for me to explain what probe data is: it's more like DNA. It tells a story of where the location was from the start to the end of that trip. And it also includes data such as a lat-long every time it drops a probe of where it was recorded, the heading and also the time stamp. So what can we do with that type of data? So one example is, last year we were able to identify a few places in Texas where there are constantly highway improvements. They're adding multiple toll booths, widening the lanes, and we were able to see in the probe data that there are times where the purple dots are traveling on a road that does not exist in OSM today, and from that we're able to identify all these locations and go back and add and edit locations like this to have them match reality. [laughter] So just to go back, you can see the original map, it's just like a very standard highway. It's just a single on ramp and off ramp, and once we were able to take a closer look, by contacting the DOT in Texas and also through satellite images, we were able to see there's a lot more information there. And especially in Texas, we're able to find a lot of places like this. They're constantly adding new toll booths. So what else can you do with probe data other than missing roads? We've found that we can also use the data to identify the direction of traffic flow, whether the road is a one-way or not, and also to identify turn restrictions.
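As a rough illustration of the probe record described above (a lat-long, a heading and a time stamp per probe), the following Python sketch shows one way a probe's heading could be compared with the bearing of a road segment to decide which way the vehicle was traveling. The names `ProbePoint`, `angle_diff` and `travel_direction` and the 45-degree tolerance are assumptions made for this example; the talk does not describe Telenav's actual data format or code.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ProbePoint:
    """One recorded probe: position, heading and time (hypothetical layout)."""
    lat: float
    lon: float
    heading: float       # degrees clockwise from north, 0 to 360
    timestamp: datetime


def angle_diff(a: float, b: float) -> float:
    """Smallest absolute difference between two compass angles, in degrees."""
    d = abs(a - b) % 360.0
    return 360.0 - d if d > 180.0 else d


def travel_direction(probe: ProbePoint, segment_bearing: float,
                     tolerance: float = 45.0) -> str:
    """Classify a probe as moving 'forward' or 'backward' along a segment.

    segment_bearing is the compass bearing of the segment as drawn in the map.
    The tolerance is an assumed value, not one given in the talk.
    """
    if angle_diff(probe.heading, segment_bearing) <= tolerance:
        return "forward"
    if angle_diff(probe.heading, (segment_bearing + 180.0) % 360.0) <= tolerance:
        return "backward"
    return "unclear"
```

For example, a probe with heading 92 on a segment drawn with bearing 90 is classified as "forward", and a probe with heading 268 on the same segment as "backward". Counting those labels per segment gives the kind of directional evidence the talk refers to.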
So I will go into a little bit of how this works, but before I do: we came up with this tool called ImproveOSM and we launched it back in November, and since then we were able to add thousands of missing roads and directions of traffic flow and also hundreds of turn restrictions. So you can see here, most of the additions are in the United States, but we do have probe data around the world, so it just adds to the total number that we have. For missing roads, how does it work? It's similar to the example I just showed a few minutes ago, where we take the probe data, we lay it over the existing OSM map, and what we do is extract the highway tags from it, and, you know, we take the entire trip of that probe data, from the beginning, middle and end. For the direction of traffic flow, we have an algorithm where we look for segments that have been passed through at least 100 times, where the segment is less than 20 meters. This is a way for us to build confidence about whether a road is in fact a one-way or not, because if it's only been traveled five times, the confidence level is a lot lower. And the same with the missing roads. We extract all the highway tags, and edges such as service roads, private roads and one-ways are filtered out. The lower-level roads and the one-ways that we already know for a fact are one-way are left out of this algorithm. For turn restrictions, we do the same thing. We take the probe data, we lay it over an intersection, and we look for the start segment and the end segment, and we look for the cases where the probes are not traveling over to one side of the road. Let's say we're at an intersection: we plot hundreds of probes at that intersection and you can see that a lot of people are either going straight or right, with very few to no left turns, so with that we're able to show, with a level of confidence, that there's probably a sign there that says no left turn.
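The two checks described above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: the 100-trip minimum comes from the talk, but the confidence formula (share of trips in the dominant direction), the 1% cutoff for a "rare" turn, and the function names `one_way_confidence` and `rare_turns` are made up for this example and are not Telenav's actual algorithm. The 20-meter segment condition is not modeled here.

```python
from collections import Counter
from typing import Iterable, List, Optional

MIN_TRIPS = 100  # minimum number of trips mentioned in the talk


def one_way_confidence(forward_trips: int, backward_trips: int,
                       min_trips: int = MIN_TRIPS) -> Optional[float]:
    """Share of trips in the dominant direction, or None with too little data.

    The formula is an assumption; the talk only says that more trips
    give more confidence.
    """
    total = forward_trips + backward_trips
    if total < min_trips:
        return None
    return max(forward_trips, backward_trips) / total


def rare_turns(observed: Iterable[str], possible: Iterable[str],
               min_trips: int = MIN_TRIPS,
               max_share: float = 0.01) -> List[str]:
    """Return the possible turns that almost nobody makes at an intersection.

    observed holds one label per trip through the intersection, for example
    'left', 'straight' or 'right'. max_share is an assumed cutoff.
    """
    counts = Counter(observed)
    total = sum(counts.values())
    if total < min_trips:
        return []  # not enough trips to suggest anything
    return sorted(t for t in possible if counts[t] / total <= max_share)
```

With 120 trips going straight, 79 turning right and 1 turning left, `rare_turns` returns `["left"]`, which would only be a candidate no-left-turn restriction for a mapper to verify, not a confirmed one.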
So we calculate the level of confidence and then we generate side files and we import it into ImproveOSM. So now that I've talked about how it works, well, what does it look like? So if you go to ImproveOSM.org, this is the first thing you see. There is a catch, though: if you want to edit, you need to log in with your OSM account. I assume everyone here has an OSM account. Without one, you can still look around, you can browse around the site, but you cannot make any edits. The heat map will show you all the different locations we've identified that are either missing a turn restriction, missing a road, or have a highway that needs to be validated. And the site also has permalinks, so even if you are a guest and you're not making edits, you can still zoom in to a certain level, copy the link from the search bar, and paste it and share it with your friends. You may be in a situation where you're not sure if the intersection has a turn restriction or not, but you might know someone who might, so you can always copy and paste that link over to them. So just to zoom in on the bar on the right side, this is more of a filter. On the tile layer, the basic map is an OSM map. You can switch over to ESRI, and you can also sort by the status of the type of roads you want to look at, so the open ones are the ones that need to be validated, whereas the solved ones are already solved by the community. And you can also choose the type of edits that you want to make, so if you want to edit one-way roads, you can select from a list of three possibilities here: highly probable, most likely, and probable. So the probable ones are the ones that we're a little bit unsure of, so for those, it's a little bit more challenging, and I believe local knowledge is useful in these types of situations. So you're able to select from these three, or you can have everything checked. So here's a fairly obvious example of an on ramp for a one-way.
So in this image, it does not show a one-way direction, and in the box on the left, there's information about that segment. So that on ramp has been traveled on over 7,000 times, with a confidence level of 99.9%, and it also includes other information such as where it's located, the type of road it is, and the confidence level. So just looking at that map, you can tell that it's obviously an on ramp, and the number of trips driven on it gives you the confidence to know that it should be a one-way on ramp. And if you want to make the edit, you can choose to do the edit on OSM.org or through JOSM, and once you make that edit, of course, you need to make a comment. It's very helpful, and you can change the status from there. For the invalids, one of the small side effects we have is that when we plot the probe data, there are often times when the driver will drive into parking lots, so we're able to see, you know, certain places like this where there is a concentrated level of probe data in parking lots that are not actual roads. So those can be sorted out by marking them as invalid, and we can look at other places for the true missing roads. So on my next slide, I have a quick video. There's no audio here, but it just quickly shows how the edit is done. So since we're in Seattle, we can zoom in to look for a one-way, so you can quickly zoom in, go to the arrow, switch over to the satellite image, and just click on it, and you can see the information on the left. So the next example is for the turn restrictions. You can click on it and you can see the same information, as well, and to add more information, you can click on probable, just so you see the level of confidence. And the last one is the missing roads.
We have tiles that will show in certain areas with the probe data on them, so here you can see on the map that there's a road missing, but if you zoom in, you can see a concentrated level of probes on the map, and when you switch over to satellite images, you can see that there is actually a road there. So the user switched over to JOSM, and now they're just kind of panning over and validating that the different satellite images confirm that there actually is a road there, and from there, you can go in and make the necessary edits. AUDIENCE MEMBER: [inaudible] >> Well, there are often times when there's not enough of it, but it's more concentrated in some areas. This is just to identify a certain area. Just like the slide with the on ramp, we're able to identify one part of it, but once you go in there, you can see that there are a lot of other roads that are merging from there. So again, add your comment, save it, and then click solved, and that's it. So it should go live on OSM shortly after that. So we're really excited about the next thing for ImproveOSM. In the next couple of weeks we're going to add ImproveOSM to iD as a layer. So you can still go to ImproveOSM.org, now or even when this launches, as a separate place to make your edits, but once this launches, you can make your edits directly from iD. So the look will be different but the core usability will be the same. You can see the circles on the map. There are three different colors to identify the three different types of edits that you want to make. Just to zoom in a little bit, you can see on the right side there's a layer that you can toggle, ImproveOSM. Once you check that, you'll see all these circles that identify the locations that need to be edited. And at a lower level, at CenturyLink, just a short distance from here, there's a no left turn there.
So at the time when I pulled this screenshot, I didn't make the edit. It's still there. Just as an example to show that, you know, there are still a few places, especially in the downtown area, where edits still need to be made, and a tool like ImproveOSM will be able to identify things like this. So thanks for your time. If you want to be up to date on ImproveOSM, you can follow us on Twitter at ImproveOSM, and make sure you stop by the booth. I also just want to make a quick plug for the presentation on OpenStreetView, tomorrow at 3 p.m. at the Pigott building where we were just at, and if you stop by our booth, we do have gift bags. Some of you may have seen, I was there in the morning and I was able to meet a lot of you, and I'll be back there later again to say hi to you guys. We also have a daily prize drawing, so please stop by just to talk, if you have questions about ImproveOSM or about OpenStreetView. Thank you. [applause] In the recording false positives, because the data, when you get false positives from humans looking at this output? AUDIENCE MEMBER: What was the question? >> Sure, the question was how do you detect false positives. >> And how do you iterate your own detection algorithms based on the false positives. >> Right, in our algorithm we rely on the number of times that the probe travels through that type of road. So it's more obvious for the one-way directions or the missing roads, but for turn restrictions it's a lot trickier. So we have to adjust the algorithm to make sure that it's a little bit more sensitive to those kinds of places, especially with turn restrictions, because just because no one is making a left turn, it could be because of a time restriction, so for those, local knowledge is really preferred. But we try to tweak the algorithm as much as we can just to make sure that we don't run into these false positives.
AUDIENCE MEMBER: How do you deal with restricted roads? I'm guessing you will sometimes get those -- some subset of traffic that -- [inaudible] >> Right, in that case if we don't have that data, then it's not shown on the map. These are only in places where we do have people driving on the roads, so oftentimes the user somehow does drive on the restricted roads, but it's oftentimes not enough for us to really pick it up. AUDIENCE MEMBER: Do you have any idea what percent of your users are truck drivers or equipment operators, versus automobiles? Because I've fixed a lot of truck parking lots, so where I live, where almost all the roads are tagged as residential, if I change it to service, that would then [inaudible] >> That's correct, yes. AUDIENCE MEMBER: So I'm changing lots of parking lots from residential to service. >> So hopefully it is a service road, right? So if it is, this tool will drop it. But if it's a residential road, then it would be picked up once a certain number of trips have been driven past it. AUDIENCE MEMBER: Are there future plans to [inaudible] >> I'm sorry, can you repeat that? AUDIENCE MEMBER: Hi, is there a future plan to acquire input data through vehicle-to-vehicle and vehicle-to-infrastructure technologies? Like, connecting ITS technologies? >> At this time we don't have plans for that, but that's something we can look into that I think is very useful. AUDIENCE MEMBER: You mentioned the time restrictions thing. You have the time stamps, and a lot of turn restrictions and directionality restrictions are time restricted. Do you have any plans to start detecting those kinds of conditions, as well? >> Yeah, so at Telenav last year, we did send out vehicles in a few different cities and we were able to extract this type of information for turn restrictions. The tricky thing right now is that with OSM, we still don't know the proper way to tag time restrictions.
I mean, I looked on the wiki page, and there are different methods that people have, and we haven't concluded on the correct one to really use, but that is something that we would add in as an additional feature. AUDIENCE MEMBER: Hi. Have you considered using speed to try and infer, maybe, speed limits or something like that? >> For this tool, no. Because people can speed in residential areas, or also on highways, so it's hard to tell. And so it's much better to really rely on ground truth in those situations, instead of relying on probe data, because we're afraid of, I guess, the false positives that we can get from that. >> [inaudible] >> It's about 2 seconds. We will take another location from that probe. So it's about a second or two, so of course the more concentrated the better. If it's scattered within a minute or two, it's pretty much useless, especially for turn restrictions. If you're in one spot and the next minute, the next point is somewhere else at a cross street, it's really useless. We don't know where they went prior to that. >> [inaudible] AUDIENCE MEMBER: How much data do you need to be confident? How many passes in that intersection on the turn restrictions? >> We feel more confident when it's in the 90% range. Again, the turn restriction is very tricky so we want to be as confident as possible. That can really screw up routing so we are very sensitive to that. In the back? AUDIENCE MEMBER: Hi, how do you correct or I guess try and notice temporary road closures or changes, I'm thinking about maybe construction that reroutes so that a one-way street may look bidirectional for a few months? >> For this probe data, we collect it over a long period of time. For road construction, we tend to look -- if we identify for some reason on a major highway, you know, no one's traveling through it, then we would go to the local government website and see if they have a major construction.
Usually they do have that posted on their website, and from there, if they don't have a completion date, then I would personally contact them, and again, they're always very quick to respond to let me know when they expect to finish the road construction. So once it becomes available again, what we do is we go back and check it, or I'll contact them again and ask, is this officially open now, and if yes, then we'll make the edit to say that the road is now open. AUDIENCE MEMBER: How precise is the data? Is it enough to get multi-lanes that are coming from the highway or maybe it's the number of exit lanes with the highway? >> Not with this tool. I think what's in tomorrow's talk, OpenStreetView, is a better way of identifying the number of lanes. Another way is just identifying these types of roads and zooming in to the satellite view, and hopefully there's enough detail to see the lanes that are on there. >> OK, thank you very much. [applause]